Smart Paper Notes

Artificial Intelligence Security Competition (AISC)

Yinpeng Dong , Peng Chen , Senyou Deng , Lianji L , Yi Sun , Hanyu Zhao , Jiaxing Li , Yunteng Tan , Xinyu Liu , Yangyi Dong

Categories: Artificial Intelligence | Computer Vision | Machine Learning

2022-12-07

The security of artificial intelligence (AI) is an important research area towards safe, reliable, and trustworthy AI systems. To accelerate the research on AI security, the Artificial Intelligence Security Competition (AISC) was organized by the Zhongguancun Laboratory, China Industrial Control Systems Cyber Emergency Response Team, Institute for Artificial Intelligence, Tsinghua University, and RealAI as part of the Zhongguancun International Frontier Technology Innovation Competition (https://www.zgc-aisc.com/en). The competition consists of three tracks, including Deepfake Security Competition, Autonomous Driving Security Competition, and Face Recognition Security Competition. This report will introduce the competition rules of these three tracks and the solutions of top-ranking teams in each track.

Translated by Google Translate

FAS-UNet: A Novel FAS-driven Unet to Learn Variational Image Segmentation

Hui Zhu , Shi Shu , Jianping Zhang

Categories: Computer Vision | Artificial Intelligence | Machine Learning

2022-10-27

Solving variational image segmentation problems with hidden physics is often expensive and requires different algorithms and manual tuning of model parameters. Deep learning methods based on the U-Net structure have achieved outstanding performance in many medical image segmentation tasks, but designing such networks requires many parameters and large amounts of training data, which are not always available for practical problems. In this paper, inspired by the traditional multi-phase convexity Mumford-Shah variational model and the full approximation scheme (FAS) for solving nonlinear systems, we propose a novel variational-model-informed network (denoted as FAS-Unet) that exploits the model and algorithm priors to extract multi-scale features. The proposed model-informed network integrates image data and mathematical models, and implements them by learning a few convolution kernels. Based on variational theory and the FAS algorithm, we first design a feature extraction sub-network (FAS-Solution module) to solve the model-driven nonlinear systems, where a skip-connection is employed to fuse the multi-scale features. Second, we design a convolution block to fuse the extracted features from the previous stage, resulting in the final segmentation possibility. Experimental results on three different medical image segmentation tasks show that the proposed FAS-Unet is very competitive with other state-of-the-art methods in qualitative, quantitative, and model complexity evaluations. Moreover, it may also be possible to train specialized network architectures that automatically satisfy some of the mathematical and physical laws in other image problems for better accuracy, faster training, and improved generalization. The code is available at https://github.com/zhuhui100/FASUNet.


Energy and Spectrum Efficient Federated Learning via High-Precision Over-the-Air Computation

Liang Li , Chenpei Huang , Dian Shi , Hao Wang , Xiangwei Zhou , Minglei Shu , Miao Pan

Categories: Machine Learning | Artificial Intelligence

2022-08-15

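Federated learning (FL) enables mobile devices to collaboratively learn a shared prediction model while keeping their data local. However, deploying FL on mobile devices in practice faces two major research challenges: (i) frequent wireless gradient updates vs. limited spectrum resources, and (ii) energy-hungry FL communication and local computing during training vs. battery-constrained mobile devices. To address these challenges, in this paper we propose a novel multi-bit over-the-air computation (M-AirComp) approach for spectrum-efficient aggregation of local model updates in FL, and further present an energy-efficient FL design for mobile devices. Specifically, a high-precision digital modulation scheme is designed and incorporated in M-AirComp, allowing mobile devices to upload model updates at the selected positions simultaneously over the multi-access channel. Moreover, we theoretically analyze the convergence of the FL algorithm. Guided by the FL convergence analysis, we formulate a joint transmission probability and local computing control optimization that aims to minimize the overall energy consumption of mobile devices in FL (i.e., iterative local computing + multi-round communication). Extensive simulation results show that the proposed scheme outperforms existing ones in terms of spectrum utilization, energy efficiency, and learning accuracy.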


XMorpher: Full Transformer for Deformable Medical Image Registration via Cross Attention

Jiacheng Shi , Yuting He , Youyong Kong , Jean-Louis Coatrieux , Huazhong Shu , Guanyu Yang , Shuo Li

Categories: Computer Vision

2022-06-15

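An effective backbone network is important for deep learning-based deformable medical image registration (DMIR), because it extracts and matches features between two images to discover their mutual correspondence for fine registration. However, existing deep networks focus on the single-image setting and are limited in registration tasks, which are performed on paired images. We therefore advance a novel backbone network, XMorpher, for effective corresponding feature representation in DMIR. 1) It proposes a novel full transformer architecture with dual parallel feature extraction networks that exchange information through cross attention, discovering multi-level semantic correspondence while gradually extracting the respective features for the final effective registration. 2) It advances the Cross Attention Transformer (CAT) block to establish an attention mechanism between images, which finds the correspondence automatically and prompts the features to fuse efficiently in the network. 3) It constrains the attention computation to base windows and searching windows of different sizes, and thus focuses on the local transformations of deformable registration while also improving computational efficiency. Without any bells and whistles, XMorpher gives VoxelMorph a 2.8% improvement in DSC, demonstrating its effective representation of features from paired images in DMIR. We believe that XMorpher has great application potential for more paired medical images. XMorpher is open at https://github.com/solemoon/xmorpher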
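The cross-attention idea in point 2) can be illustrated with a minimal sketch in plain Python. This is not the CAT block from the XMorpher repository: it omits learned projections, multiple heads, and the window constraints from point 3), and the names `cross_attention`, `moving`, and `fixed` are hypothetical. It only shows queries taken from one image attending over keys and values taken from the other.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention where queries come from one image
    and keys/values come from the other (no learned projections here)."""
    d = len(queries[0])
    out = []
    for q in queries:
        # Similarity of this query token to every token of the other image.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Weighted average of the other image's value tokens.
        out.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return out


# Hypothetical toy tokens: 2 tokens from the moving image, 3 from the fixed image.
moving = [[1.0, 0.0], [0.0, 1.0]]
fixed = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
attended = cross_attention(moving, fixed, fixed)
```

Each output token is a convex combination of the other image's tokens, weighted toward the ones it matches best, which is the sense in which cross attention "finds correspondence".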


Multimodal Object Detection via Probabilistic Ensembling

Yi-Ting Chen , Jinghao Shi , Zelin Ye , Christoph Mertz , Deva Ramanan , Shu Kong

Categories: Computer Vision

2021-04-07

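Object detection with multimodal inputs can improve many safety-critical systems such as autonomous vehicles (AVs). Motivated by AVs that operate both day and night, we study multimodal object detection with RGB and thermal cameras, since the latter provide much stronger object signatures under poor illumination. We explore strategies for fusing information from different modalities. Our key contribution is a probabilistic ensembling technique, ProbEn, a simple non-learned method that fuses detections from multiple modalities. We derive ProbEn from Bayes' rule and first principles that assume conditional independence across modalities. Through probabilistic marginalization, ProbEn elegantly handles missing modalities when detectors do not fire on the same object. Importantly, ProbEn also notably improves multimodal detection even when the conditional independence assumption does not hold, e.g., when fusing outputs from other fusion methods (both off-the-shelf and trained in-house). We validate ProbEn on two benchmarks containing both aligned (KAIST) and unaligned (FLIR) multimodal images, showing that ProbEn outperforms prior work by more than 13% in relative performance!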
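The Bayes-rule fusion described above can be sketched for class scores as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes conditional independence across modalities, so that p(y | x_1, ..., x_n) is proportional to the product of the per-modality posteriors divided by p(y)^(n-1), and it defaults to a uniform class prior. The function name `fuse_posteriors` and the example scores are hypothetical, and bounding-box fusion is not covered.

```python
from math import prod


def fuse_posteriors(posteriors, prior=None):
    """Fuse per-modality class posteriors p(y | x_i), assuming the
    modalities are conditionally independent given the class y."""
    n_classes = len(posteriors[0])
    if prior is None:
        # Assumption for this sketch: uniform class prior.
        prior = [1.0 / n_classes] * n_classes
    n = len(posteriors)
    # Unnormalized fused score: prod_i p(y | x_i) / p(y)^(n - 1).
    scores = [
        prod(p[k] for p in posteriors) / (prior[k] ** (n - 1))
        for k in range(n_classes)
    ]
    total = sum(scores)
    return [s / total for s in scores]


# Hypothetical scores for (object, background) from two detectors.
rgb = [0.7, 0.3]
thermal = [0.8, 0.2]
fused = fuse_posteriors([rgb, thermal])

# A detection seen by only one modality keeps that modality's posterior.
rgb_only = fuse_posteriors([rgb])
```

When both detectors agree, the fused confidence is higher than either input; when only one detector fires, fusing over the modalities that are present leaves its score unchanged.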


PennyLane: Automatic differentiation of hybrid quantum-classical computations

Ville Bergholm , Josh Izaac , Maria Schuld , Christian Gogolin , Shahnawaz Ahmed , Vishnu Ajith , M. Sohaib Alam , Guillermo Alonso-Linaje , B. AkashNarayanan , Ali Asadi

Categories: Machine Learning

2018-11-12

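PennyLane is a Python 3 software framework for differentiable programming of quantum computers. The library provides a unified architecture for near-term quantum computing devices, supporting both qubit and continuous-variable paradigms. PennyLane's core feature is the ability to compute gradients of variational quantum circuits in a way that is compatible with classical techniques such as backpropagation. PennyLane thus extends the automatic differentiation algorithms common in optimization and machine learning to include quantum and hybrid computations. A plugin system makes the framework compatible with any gate-based quantum simulator or hardware. We provide plugins for hardware providers including the Xanadu Cloud, Amazon Braket, and IBM Quantum, allowing PennyLane optimizations to be run on publicly accessible quantum devices. On the classical side, PennyLane interfaces with accelerated machine learning libraries such as TensorFlow, PyTorch, JAX, and Autograd. PennyLane can be used to optimize variational quantum eigensolvers, quantum approximate optimization, quantum machine learning models, and many other applications.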
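One well-known way to obtain gradients of a variational circuit from circuit evaluations alone is the parameter-shift rule. The sketch below illustrates it with a hand-written one-qubit simulation using only the standard library. It deliberately does not use PennyLane's API, and the names `expval_z` and `parameter_shift_grad` are hypothetical. The circuit applies RX(theta) to |0> and measures the expectation of Pauli-Z, which equals cos(theta).

```python
import math


def expval_z(theta):
    """Expectation of Pauli-Z after applying RX(theta) to |0>."""
    # RX(theta)|0> = cos(theta/2)|0> - i*sin(theta/2)|1>
    amp0 = complex(math.cos(theta / 2), 0.0)
    amp1 = complex(0.0, -math.sin(theta / 2))
    # <Z> = P(0) - P(1)
    return abs(amp0) ** 2 - abs(amp1) ** 2


def parameter_shift_grad(f, theta, shift=math.pi / 2):
    """Parameter-shift rule for a gate generated by a Pauli operator:
    df/dtheta = (f(theta + pi/2) - f(theta - pi/2)) / 2."""
    return (f(theta + shift) - f(theta - shift)) / 2.0


theta = 0.3
grad = parameter_shift_grad(expval_z, theta)
# The analytic derivative of cos(theta) is -sin(theta).
exact = -math.sin(theta)
```

Unlike a finite-difference estimate, the shifted evaluations give the exact derivative for this gate, and each one is just another run of the same circuit, so the rule also applies when the circuit is executed on hardware.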


Path Aggregation Network for Instance Segmentation

Shu Liu , Lu Qi , Haifang Qin , Jianping Shi , Jiaya Jia

Categories:

2018-03-05

The way that information propagates in neural networks is of great importance. In this paper, we propose Path Aggregation Network (PANet), which aims to boost information flow in the proposal-based instance segmentation framework. Specifically, we enhance the entire feature hierarchy with accurate localization signals in lower layers by bottom-up path augmentation, which shortens the information path between lower layers and the topmost feature. We present adaptive feature pooling, which links the feature grid and all feature levels so that useful information in each feature level propagates directly to the following proposal subnetworks. A complementary branch capturing different views for each proposal is created to further improve mask prediction. These improvements are simple to implement, with subtle extra computational overhead. Our PANet reaches 1st place in the COCO 2017 Challenge Instance Segmentation task and 2nd place in the Object Detection task without large-batch training. It is also state-of-the-art on MVD and Cityscapes. Code is available at https://github.com/ShuLiu1993/PANet.


MGTAB: A Multi-Relational Graph-Based Twitter Account Detection Benchmark

Shuhao Shi , Kai Qiao , Jian Chen , Shuai Yang , Jie Yang , Baojie Song , Linyuan Wang , Bin Yan

Categories: Computer Vision

2023-01-03

The development of social media user stance detection and bot detection methods relies heavily on large-scale, high-quality benchmarks. However, in addition to low annotation quality, existing benchmarks generally have incomplete user relationships, which holds back graph-based account detection research. To address these issues, we propose a Multi-Relational Graph-Based Twitter Account Detection Benchmark (MGTAB), the first standardized graph-based benchmark for account detection. To our knowledge, MGTAB was built from the largest original data in the field, with over 1.55 million users and 130 million tweets. MGTAB contains 10,199 expert-annotated users and 7 types of relationships, ensuring high-quality annotation and diversified relations. In MGTAB, we extracted the 20 user property features with the greatest information gain, together with user tweet features, as the user features. In addition, we performed a thorough evaluation of MGTAB and other public datasets. Our experiments found that graph-based approaches are generally more effective than feature-based approaches and perform better when multiple relations are introduced. By analyzing the experimental results, we identify effective approaches for account detection and provide potential future research directions in this field. Our benchmark and standardized evaluation procedures are freely available at: https://github.com/GraphDetec/MGTAB.


OccluMix: Towards De-Occlusion Virtual Try-on by Semantically-Guided Mixup

Zhijing Yang , Junyang Chen , Yukai Shi , Hao Li , Tianshui Chen , Liang Lin

Categories: Computer Vision

2023-01-03

Image virtual try-on aims at replacing the cloth on a personal image with a garment image (in-shop clothes), and has attracted increasing attention from the multimedia and computer vision communities. Prior methods successfully preserve the character of clothing images; however, occlusion remains a pernicious effect for realistic virtual try-on. In this work, we first present a comprehensive analysis of the occlusions and categorize them into two aspects: i) Inherent-Occlusion: the ghost of the former cloth still exists in the try-on image; ii) Acquired-Occlusion: the target cloth warps to an unreasonable body part. Based on this in-depth analysis, we find that the occlusions can be simulated by a novel semantically-guided mixup module, which generates semantic-specific occluded images that work together with the try-on images to facilitate training a de-occlusion try-on (DOC-VTON) framework. Specifically, DOC-VTON first conducts a sharpened semantic parsing on the try-on person. Aided by semantic guidance and a pose prior, textures of various complexity are selectively blended with human parts in a copy-and-paste manner. Then, the Generative Module (GM) takes charge of synthesizing the final try-on image and jointly learning to de-occlude. In comparison to state-of-the-art methods, DOC-VTON achieves better perceptual quality by reducing occlusion effects.


Deep Spectral Q-learning with Application to Mobile Health

Yuhe Gao , Chengchun Shi , Rui Song

Categories: Machine Learning (Statistics) | Machine Learning

2023-01-03

Dynamic treatment regimes assign personalized treatments to patients sequentially over time based on their baseline information and time-varying covariates. In mobile health applications, these covariates are typically collected at different frequencies over a long time horizon. In this paper, we propose a deep spectral Q-learning algorithm, which integrates principal component analysis (PCA) with deep Q-learning to handle the mixed frequency data. In theory, we prove that the mean return under the estimated optimal policy converges to that under the optimal one and establish its rate of convergence. The usefulness of our proposal is further illustrated via simulations and an application to a diabetes dataset.
